…c upstream
- Add proficiency banding, subject mapping, and metadata columns to int_pearson__all_assessments, int_fldoe__all_assessments, and int_iready__diagnostic_results
- Add new int_pearson__student_list_report intermediate model
- Add stg_google_sheets__state_test_comparison_demographics; disable old stg_google_sheets__state_test_comparison
- Add standardized_discipline to base_powerschool__course_enrollments
- Add new int_extracts__student_enrollments_courses model
- Refactor int_extracts__student_enrollments_subjects to use upstream columns
- Simplify rpt_tableau__state_assessments_dashboard and _comps by replacing inline CASE blocks with upstream column references
- Update int_tableau__state_assessments_demographic_comps lineage to use int_pearson__student_list_report
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…g CTE
- Switch state_comps CTE to stg_google_sheets__state_test_comparison_demographics
- Move results_type, admin, season, subject, test_code upstream to int models
- Rename test_code to aligned_test_code in int_pearson__student_list_report
- Add admin and subject aliases to int_fldoe__all_assessments
- Swap stg_pearson__student_list_report ref to int_pearson__student_list_report
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…ch flags
Replace GROUP BY CUBE (1,024 combos) with explicit GROUPING SETS (12 combos) for ~85x reduction in computed groups. Consolidate the demographic comps intermediate chain from 2 models + macro into 1 model. Push demographic labels, comparison_entity, and test_code-derived columns upstream into the intermediate to simplify the reporting layer. Fix self-join bug that made region_matched/region_outperformed flags dead columns. Add uniqueness tests to stg, int, and rpt models.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
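The reduction quoted in this commit follows from how GROUP BY CUBE expands: n columns produce 2^n grouping sets. A minimal Python sketch of the arithmetic, assuming 10 cubed columns (inferred from 1,024 = 2^10; the commit does not list the columns, so the names below are hypothetical placeholders):

```python
from itertools import combinations


def cube_grouping_sets(columns):
    """Enumerate every grouping set that GROUP BY CUBE would compute."""
    return [
        combo
        for size in range(len(columns) + 1)
        for combo in combinations(columns, size)
    ]


# Hypothetical placeholder names; the real demographic columns are not listed here.
columns = [f"dim_{i}" for i in range(10)]

cube_count = len(cube_grouping_sets(columns))  # every subset of 10 columns
explicit_count = 12  # number of GROUPING SETS named in the commit message
reduction = cube_count / explicit_count
```

Under that assumption the ratio comes out just above 85, consistent with the "~85x" figure in the commit message.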
- dim_state_assessment_benchmarks.yml: keep expanded surrogate key description, use bare `- unique` test (inherits severity/store_failures from dbt_project.yml project-level defaults instead of repeating per-test)
- base_powerschool__course_enrollments: keep both courses_credittype normalization (from main) and standardized_discipline (from this branch)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add results_type, district_state, admin, subject, illuminate_subject, and fast_aggregated_proficiency as computed columns in the kipptaf int_fldoe__all_assessments model instead of depending on the kippmiami upstream to provide them. This allows rpt_tableau__state_assessments_dashboard to build without waiting for a kippmiami deployment.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Refactor int_tableau__state_assessments_demographic_comps: replace self-contained assessment_scores CTE with three union branches (NJ official, NJ prelim, FL official); add test_code_metadata CTE from stg_google_sheets__state_test_comparison_demographics to replace inline school_level/grade_range_band/discipline CASE statements; fix aligned column references and unqualified ON clause columns
- Add YML for int_fldoe__all_assessments (kipptaf): uniqueness test, model description, and full column definitions including new metadata columns (results_type, district_state, aligned_level_test_code, illuminate_subject, fast_aggregated_proficiency, is_proficient_int, admin, subject)
- Add YML for int_pearson__all_assessments: uniqueness test, model description, full column definitions; fix stale columns (englishlearnerel, studentwithdisabilities removed; aligned_* demographic columns, is_proficient_int, season, admin added)
- Add YML for int_pearson__student_list_report: uniqueness test on (source_relation, academic_year, administration, state_id, aligned_test_code), model description, full column definitions; fix missing trailing comma in SQL
- Update stg_google_sheets__state_test_comparison_demographics YML: move data_tests before columns, remove redundant store_failures, add missing aligned_level_test_code column
- Add aligned_gender to int_extracts__student_enrollments YML
- Remove design spec (work complete)
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Add prelim score gating to int_tableau__state_assessments_demographic_comps: new prelim_assessments and valid_prelim_assessments CTEs automatically exclude NJ student list data for any (academic_year, assessment_name) already present in int_pearson__all_assessments Spring, eliminating the need to manually comment/uncomment the prelim branch each cycle
- Qualify all column references in final SELECT with scores alias (s.) to satisfy RF02 after test_code_metadata join was introduced
- Replace rolling 7-year window with fixed 2018 floor across all three score branches; 2018 is the earliest year with available comps data
- Fix rpt_tableau__state_assessments_dashboard: inline-alias administration_window and assessment_subject from int_fldoe__all_assessments instead of expecting pre-computed admin/subject columns
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
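The prelim gating rule in this commit amounts to an anti-join: a prelim row survives only if its (academic_year, assessment_name) pair has no official Spring result. A minimal Python sketch of that rule, using a simplified row shape and hypothetical sample values in place of the actual dbt models:

```python
def gate_prelim_rows(prelim_rows, official_rows):
    """Keep prelim rows whose (academic_year, assessment_name) has no official Spring score."""
    official_keys = {
        (row["academic_year"], row["assessment_name"])
        for row in official_rows
        if row["season"] == "Spring"
    }
    return [
        row
        for row in prelim_rows
        if (row["academic_year"], row["assessment_name"]) not in official_keys
    ]


# Hypothetical sample data; field names mirror the commit message, values are invented.
official = [{"academic_year": 2023, "assessment_name": "NJSLA", "season": "Spring"}]
prelim = [
    {"academic_year": 2023, "assessment_name": "NJSLA", "state_id": "A1"},
    {"academic_year": 2024, "assessment_name": "NJSLA", "state_id": "A1"},
]

kept = gate_prelim_rows(prelim, official)
```

Here only the 2024 prelim row is kept, because 2023 already has an official Spring result. This is the behavior that removes the need to comment the prelim branch in and out each cycle.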
- Strip stg_google_sheets__state_test_comparison_demographics to SELECT * so all columns (school_level, grade_range_band, discipline, aligned_level_test_code, etc.) come directly from the _v2 sheet range
- Update sources-external.yml with _v2 sheet range
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…hics rename
- Rename test_name → assessment_name, test_code → aligned_test_code in dim_state_assessment_benchmarks (SQL + contract YML) and state_comps CTE in rpt_tableau__state_assessments_dashboard
- Rename comparison_demographic_group/subgroup_aligned → aligned_comparison_demographic_group/subgroup across all three downstream models
- Fix join conditions in rpt_tableau__state_assessments_dashboard against state_comps CTE (all three score sections)
- Add school_level to uniqueness test on stg_google_sheets__state_test_comparison_demographics; this differentiates MS_HS source rows from synthetic HS ALG01 aggregate rows
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…raphics
- Add custom_rows CTE to union source data with synthetic HS ALG01 totals (aggregated from HS_09 and HS_10 rows, which official comp sources do not provide as a combined HS total)
- Rename derived columns: comparison_demographic_group/subgroup_aligned → aligned_comparison_demographic_group/subgroup
- Update aligned_level_test_code derivation to use aligned_test_code instead of test_code
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Keep SELECT * from transformations CTE (main's explicit list was missing race_ethnicity computed in transformations)
- Add testscorecomplete to union_relations include list, transformations CTE, and YML (new column introduced in main)
- Keep full column descriptions from branch; reject main's stripped YML
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
Claude finished @cbini's task in 6m 26s
PR Review: refactor(dbt): state assessments rebuild
This is a well-reasoned rebuild overall. The
Critical
1. Broken
when test_name like '%Mathematics%'
then concat('MAT', regexp_extract(test_name, r'.{6}(.{2})'))
when test_name like '%ELA%'
then concat('ELA', regexp_extract(test_name, r'.{6}(.{2})'))
When
Important
2. Dead NJ prelim UNION ALL branch in
Lines 476-591 filter
3. The PR body acknowledges this, but the YML contains no
4. Race/ethnicity source asymmetry across union branches
The inline CASE in prelim/FL also lacks an
5. Lines 42-52:
6. YML: Missing
Per
7. YML:
Per convention: "Do not add
Both instances should be removed.
Minor
8. ALG01 HS aggregation sums percentages instead of weighted average (pre-existing, now load-bearing)
sum(percent_proficient) as percent_proficient,
sum(total_students) as total_students,
This sums two percentage values (HS_09 + HS_10) rather than computing a weighted average. The outer SELECT then multiplies
9. Region self-join
Summary
Issues 1, 2, and 3 should be resolved before merge. Issues 4-7 should either be fixed or tracked in a follow-up issue with a reference added to the code. |
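Issue 8 can be illustrated with made-up numbers: summing two subgroup percentages gives a figure that is not a rate at all, whereas weighting each percentage by its student count gives the combined rate. A minimal Python sketch; the values are hypothetical, not drawn from the comps sheet, and the sketch ignores whatever adjustment the outer SELECT applies:

```python
def weighted_percent(rows):
    """Combine subgroup percentages, weighting each by its student count."""
    total_students = sum(students for _, students in rows)
    weighted_sum = sum(percent * students for percent, students in rows)
    return weighted_sum / total_students


# (percent_proficient, total_students) for HS_09 and HS_10; hypothetical values.
rows = [(60.0, 100), (50.0, 300)]

summed = sum(percent for percent, _ in rows)  # what sum(percent_proficient) yields
weighted = weighted_percent(rows)  # combined rate across both grades
```

With these inputs the plain sum is 110.0, which cannot be a valid percentage, while the weighted average is 52.5.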
…nt, fix prelim gating, fix FAST source and ALG01 state comps fan-out
- int_pearson__all_assessments: add is_504, is_approaching_int, is_below_int columns
- int_fldoe__all_assessments (ktaf): fix achievement_level_int via performance_level, add is_approaching_int/is_below_int; filter PM3/PM1 to assessment_name = 'FAST' to prevent EOC/Science contamination
- rpt_tableau__academic_goals_rollup: swap stg_fldoe__fast ref to int_fldoe__all_assessments
- rpt_tableau__state_assessments_dashboard: add prelim score gating (auto-suppresses when official scores exist); fix state_comps join to use school_level instead of aligned_level_test_code to resolve ALG01 MS/HS fan-out
- int_tableau__state_assessments_demographic_comps: fix prelim gate join to use admin instead of season; fix same ALG01 fan-out via admin column
- rpt_tableau__state_assessments_dashboard_comps: update column refs to match renamed demographics staging columns
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…; remove store_failures from individual tests
- int_pearson__student_list_report: replace .{6}(.{2}) regex with
Grade (\d+) + lpad — fixes wrong aligned_test_code for grades 3-9
where single-digit grade names caused positional extraction to capture
a space character instead of the grade number
- int_tableau__state_assessments_demographic_comps.yml,
rpt_tableau__state_assessments_dashboard_comps.yml: remove
store_failures: true from individual uniqueness tests — project
default handles this
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
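The regex fix in this commit can be reproduced outside BigQuery. The Python sketch below contrasts the positional pattern with the `Grade (\d+)` plus left-pad approach. The sample test names are hypothetical (the real `test_name` format is not shown in this thread), and `str.zfill` stands in for SQL `lpad`:

```python
import re


def positional_code(prefix, test_name):
    """Old approach: skip six characters, then capture the next two."""
    match = re.search(r".{6}(.{2})", test_name)
    return prefix + match.group(1) if match else None


def grade_code(prefix, test_name):
    """New approach: capture the digits after 'Grade ' and left-pad to two."""
    match = re.search(r"Grade (\d+)", test_name)
    return prefix + match.group(1).zfill(2) if match else None


# Hypothetical test names; the actual source values may be formatted differently.
single_digit = "Grade 3 Mathematics"
double_digit = "Grade 10 Mathematics"

old_single = positional_code("MAT", single_digit)
new_single = grade_code("MAT", single_digit)
old_double = positional_code("MAT", double_digit)
new_double = grade_code("MAT", double_digit)
```

With these sample names, both approaches agree on the two-digit grade, but the positional pattern pulls a space into the single-digit code, while the digit capture yields a padded two-digit grade.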
|
@claude - update your review |
|
Claude finished @GabyRangelB's task in 2m 54s
PR Review: refactor(dbt): state assessments rebuild
Updated review: all 9 prior issues re-checked against current branch.
Resolved since last review
Still open
4. Race/ethnicity source asymmetry across union branches: still present
The inline CASE handles
5. Both and pa.test_type = p.assessment_name
If the Dagster
6. Missing
Per
Minor (pre-existing, tracked)
8. ALG01 HS aggregation sums percentages instead of weighted average: still present
sum(percent_proficient) as percent_proficient,
sum(total_students) as total_students,
9. Region self-join
Summary
Issues 4–6 should be resolved or tracked before merge. Issues 8–9 are minor and pre-existing. |
…/delete conflict
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…from int_pearson and int_fldoe
Replace inline subject alignment, test code normalization, proficiency flag computations, and period mapping in downstream consumers of int_pearson__all_assessments and int_fldoe__all_assessments with the pre-computed columns now available on both models. Also adds model + column descriptions to int_tableau__state_assessments_demographic_comps.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Downstream model changes: notice to model owners
The new columns added to
Each change was validated by comparing output row counts and metric values against production BigQuery before and after, with zero data differences in all cases except where noted.
@awalters (Anthony Walters)
|
| Removed | Replaced with |
|---|---|
| `case when subject like 'English%' then 'Reading' ...` (NJ) | `if(illuminate_subject = 'Text Study', 'Reading', 'Math')` |
| `if(assessment_subject = 'English Language Arts', 'Reading', 'Math')` (FL ×2) | `if(illuminate_subject = 'Text Study', 'Reading', 'Math')` |
Validation: Math/Reading row counts unchanged. 16-row total diff predates this change (FL/iReady/Star upstream data).
Next steps: is_proficient_int, is_approaching_int, is_below_int already read directly from upstream — no further action needed.
rpt_tableau__miami_fast
What changed: is_proficient was a FLOAT64 computed inline. Now reads is_proficient_int (INT64) directly; contract YAML updated to match:
| Removed | Replaced with |
|---|---|
| `case ft.is_proficient when true then 1.0 when false then 0.0 end as is_proficient` | `ft.is_proficient_int as is_proficient` |
| `data_type: float64` in YML | `data_type: int64` |
Validation: Zero diff at 39,427 rows. Tableau treats INT64 and FLOAT64 identically for numeric fields.
Next steps: The remaining if(... and fl.is_proficient ...) boolean checks in the disparity/growth CTEs cannot be replaced with is_proficient_int directly, but fast_aggregated_proficiency, sublevel_number, scale_for_proficiency, and points_to_proficiency are now all pre-computed on int_fldoe__all_assessments and may simplify those CTEs further.
rpt_gsheets__kippmiami_payout_roster (currently disabled)
What changed: Two aggregation expressions recomputed is_proficient_int inline:
| Removed | Replaced with |
|---|---|
| `round(avg(if(is_proficient, 1, 0)), 2)` ×2 | `round(avg(is_proficient_int), 2)` ×2 |
Validation: Pre-change query confirmed equivalence for all groups — zero mismatches.
Next steps when re-enabling: fast_aggregated_proficiency, sublevel_number, scale_for_proficiency, points_to_proficiency, scale_for_growth, and points_to_growth are now pre-computed on int_fldoe__all_assessments and may simplify the growth and disparity CTEs.
@cbini — Charlie Bini
int_topline__state_assessments_weekly
What changed: Both branches computed is_proficient_int inline. Both now read the column directly:
| Removed | Replaced with |
|---|---|
| `case when fl.is_proficient then 1 when not fl.is_proficient then 0 end` (FL) | `fl.is_proficient_int` |
| `case when p.is_proficient then 1 when not p.is_proficient then 0 end` (NJ) | `p.is_proficient_int` |
Validation: Proficient / not-proficient / total counts by region (Camden, Miami, Newark) — zero diff.
dim_assessments
What changed: The state_nj CTE re-derived subject area and test code inline:
| Removed | Replaced with |
|---|---|
| `if(subject = 'English Language Arts/Literacy', 'English Language Arts', subject)` | `aligned_subject` |
| `case testcode when 'SC05' then 'SCI05' ... end` | `aligned_test_code` |
Validation: Zero diff across all 24 subject/module combinations.
fct_assessment_scores_enrollment_scoped
What changed: The state_nj CTE computed subject alignment, test code normalization, and period mapping inline:
| Removed | Replaced with |
|---|---|
| `if(subject = 'English Language Arts/Literacy', 'English Language Arts', subject)` | `aligned_subject` |
| `case testcode when 'SC05' then 'SCI05' ... end` | `aligned_test_code` |
| `if(period = 'FallBlock', 'Fall', period)` ×2 | `admin` |
Validation: Zero diff across all 23 subject/module combinations. Surrogate key stability confirmed (administration_window values unchanged).
Next steps: The state_fl CTE could be similarly simplified — int_fldoe__all_assessments now exposes is_proficient_int, is_approaching_int, is_below_int, and illuminate_subject directly.
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
@claude please update your review |
|
Claude finished @GabyRangelB's task in 1m 51s
PR Review: refactor(dbt): state assessments rebuild
Updated review: all 9 prior issues re-checked against current branch.
Resolved since last review
Still open
4. Race/ethnicity source asymmetry and missing
Both inline
5. Both and pa.test_type = p.assessment_name
If the Dagster
6. Missing model and column descriptions
Per
Minor (pre-existing, tracked)
8. ALG01 HS aggregation sums percentages instead of weighted average: still present
sum(percent_proficient) as percent_proficient,
This sums HS_09 + HS_10 percentages rather than computing a weighted average. The outer
9. Region self-join
Summary
Issues 4 (missing |
…est_type coupling comment, and comps yml descriptions
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
@claude please update your review |
|
Claude finished @GabyRangelB's task in 2m 23s
PR Review: refactor(dbt): state assessments rebuild
Updated review: all prior issues re-checked against current branch.
Resolved since last review
Full model description and all 16 column descriptions are now present. All three YML files from the original review now have complete descriptions.
Still open
4. Race/ethnicity source asymmetry: missing
The
5.
Minor (pre-existing, tracked)
8. ALG01 HS aggregation sums percentages instead of weighted average: still present
sum(percent_proficient) as percent_proficient,
This sums HS_09 + HS_10 percentages rather than a weighted average. The outer
9. Region self-join
Summary
Issues 4 and 5 are the only open blockers. Both are one-line fixes. Issues 8–9 are minor and pre-existing. |
…sments_dashboard
Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
|
Issue 5 fixed. On issue 4: we intentionally left out |
Pull Request
Summary & Motivation
When merged, this pull request will rebuild the NJ and FL state assessment reporting layer. The core motivation is to move proficiency banding, demographic labeling, and assessment metadata computation upstream into intermediate models — so that reporting models become thin joins rather than logic-heavy CTEs. It also introduces a new Google Sheets comparison demographics source, consolidates the demographic comps pipeline, and adds several correctness fixes discovered during validation.
`stg_google_sheets__state_test_comparison_demographics`
`int_pearson__all_assessments`
`int_pearson__student_list_report` (new)
`int_fldoe__all_assessments` (kipptaf)
`int_tableau__state_assessments_demographic_comps`
`rpt_tableau__state_assessments_dashboard`
`rpt_tableau__state_assessments_dashboard_comps`
`rpt_tableau__academic_goals_rollup`
`base_powerschool__course_enrollments`
`int_extracts__student_enrollments_subjects`
New model: `int_extracts__student_enrollments_courses`
YMLs
AI Assistance
This PR was largely Claude Code-assisted under human direction. Human directed the feature requirements and data model decisions (which fields to add, how prelim gating should work, the school_level join approach for ALG01, the GROUPING SETS refactor). Claude Code handled implementation, debugging, and row-count validation against production BigQuery tables.